Hypothesis Testing For Densities and High-Dimensional Multinomials: Sharp Local Minimax Rates
Abstract
We consider the goodness-of-fit testing problem of distinguishing whether the data are drawn from a specified distribution, versus a composite alternative separated from the null in the total variation metric. In the discrete case, we consider goodness-of-fit testing when the null distribution has a possibly growing or unbounded number of categories. In the continuous case, we consider testing a Lipschitz density, with possibly unbounded support, in the low-smoothness regime where the Lipschitz parameter is not assumed to be constant. In contrast to existing results, we show that the minimax rate and critical testing radius in these settings depend strongly, and in a precise way, on the null distribution being tested. This motivates the study of the (local) minimax rate as a function of the null distribution. For multinomials, the local minimax rate was recently studied in the work of Valiant and Valiant [30]. We revisit and extend their results and develop two modifications to the χ²-test whose performance we characterize. For testing Lipschitz densities, we show that the usual binning tests are inadequate in the low-smoothness regime, and we design a spatially adaptive partitioning scheme that forms the basis for our locally minimax optimal tests. Furthermore, we provide the first local minimax lower bounds for this problem, which yield a sharp characterization of the dependence of the critical radius on the null hypothesis being tested. In the low-smoothness regime we also provide tests that adapt to the unknown smoothness parameter. We illustrate our results with a variety of simulations that demonstrate the practical utility of our proposed tests.
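For orientation, the discrete problem described in the abstract can be illustrated with the classical Pearson χ² statistic, which is the baseline that the paper modifies. The sketch below is not the authors' modified test, whose details are not given on this page; it is only the textbook statistic, calibrated by simulating from the null instead of relying on the asymptotic χ² approximation. All function names (`pearson_chi2`, `sample_counts`, `monte_carlo_p_value`) are hypothetical.

```python
import random


def pearson_chi2(counts, null_probs):
    """Classical Pearson statistic: sum_i (X_i - n p_i)^2 / (n p_i).

    Categories with null probability 0 are skipped (an assumption of this sketch).
    """
    n = sum(counts)
    return sum(
        (x - n * p) ** 2 / (n * p)
        for x, p in zip(counts, null_probs)
        if p > 0
    )


def sample_counts(n, probs, rng):
    """Draw n observations from a multinomial and return per-category counts."""
    counts = [0] * len(probs)
    for i in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[i] += 1
    return counts


def monte_carlo_p_value(counts, null_probs, n_sim=2000, seed=0):
    """Estimate the p-value by simulating the statistic under the null."""
    rng = random.Random(seed)
    n = sum(counts)
    observed = pearson_chi2(counts, null_probs)
    exceed = sum(
        pearson_chi2(sample_counts(n, null_probs, rng), null_probs) >= observed
        for _ in range(n_sim)
    )
    # Add-one correction keeps the estimated p-value strictly positive.
    return (1 + exceed) / (1 + n_sim)


if __name__ == "__main__":
    null = [0.5, 0.3, 0.2]
    print(monte_carlo_p_value([90, 5, 5], null))
```

Simulation-based calibration is used here because the χ² limiting distribution can be unreliable when many categories have small expected counts, which is the high-dimensional setting the abstract is concerned with.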
Related resources
Hypothesis Testing for High-Dimensional Multinomials: A Selective Review
The statistical analysis of discrete data has been the subject of extensive statistical research dating back to the work of Pearson. In this survey we review some recently developed methods for testing hypotheses about high-dimensional multinomials. Traditional tests like the χ²-test and the likelihood ratio test can have poor power in the high-dimensional setting. Much of the research in this a...
A New Method for Sperm Detection in Human Semen: Combination of Hypothesis Testing and Local Mapping of Wavelet Sub-Bands
Introduction Automated methods for sperm characterization in microscopic videos have some limitations such as: low contrast of the video frames and possibility of neighboring sperms to touch each other. In this paper a new method is introduced for detection of sperms in microscopic videos. Materials and Methods In this work, first microscopic videos are captured from specimens of human semen. S...
Optimal Calibration for Multiple Testing against Local Inhomogeneity in Higher Dimension
Based on two independent samples X1, ..., Xm and Xm+1, ..., Xn drawn from multivariate distributions with unknown Lebesgue densities p and q respectively, we propose an exact multiple test in order to identify simultaneously regions of significant deviations between p and q. The construction is built from randomized nearest-neighbor statistics. It does not require any preliminary information abou...
On the Minimax Optimality of Block Thresholded Wavelets Estimators for ?-Mixing Process
We propose a wavelet-based estimator of the regression function for a sequence of ?-mixing random variables with a common one-dimensional probability density function. Some asymptotic properties of the proposed estimator based on block thresholding are investigated. It is found that the estimators achieve optimal minimax convergence rates over large class...
Minimax testing of a composite null hypothesis defined via a quadratic functional in the model of regression
We consider the problem of testing a particular type of composite null hypothesis under a nonparametric multivariate regression model. For a given quadratic functional Q, the null hypothesis states that the regression function f satisfies the constraint Q[f] = 0, while the alternative corresponds to the functions for which Q[f] is bounded away from zero. On the one hand, we provide minimax ra...
Journal: CoRR
Volume: abs/1706.10003
Publication date: 2017